The combination of machine learning models with physical models is a recent research path to learn robust data representations. In this paper, we introduce p$^3$VAE, a generative model that integrates a perfect physical model which partially explains the true underlying factors of variation in the data. To fully leverage our hybrid design, we propose a semi-supervised optimization procedure and an inference scheme that comes with meaningful uncertainty estimates. We apply p$^3$VAE to the semantic segmentation of high-resolution hyperspectral remote sensing images. Our experiments on a simulated data set demonstrate the benefits of our hybrid model over conventional machine learning models in terms of extrapolation capabilities and interpretability. In particular, we show that p$^3$VAE naturally has high disentanglement capabilities. Our code and data have been made publicly available at https://github.com/Romain3Ch216/p3VAE.
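Multimodal information is frequently available in medical tasks. By combining information from multiple sources, clinicians can make more accurate judgments. In recent years, multiple imaging techniques have been used in clinical practice for retinal analysis: 2D fundus photographs, 3D optical coherence tomography (OCT), 3D OCT angiography, and others. Our paper investigates three deep-learning-based multimodal information fusion strategies for retinal analysis tasks: early fusion, intermediate fusion, and hierarchical fusion. The commonly used early and intermediate fusions are simple but do not fully exploit the complementary information between modalities. We developed a hierarchical fusion approach that focuses on combining features across multiple dimensions of the network and on exploring the correlation between modalities. These approaches were applied to glaucoma and diabetic retinopathy classification, using the public GAMMA dataset (fundus photographs and OCT) and a private dataset from the PlexElite 9000 (Carl Zeiss Meditec Inc.), respectively. Our hierarchical fusion method performed best in these cases and paves the way for better clinical diagnosis.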
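Longitudinal imaging can capture both static anatomical structures and dynamic changes in disease progression, enabling earlier and better patient-specific pathology management. However, conventional approaches for detecting diabetic retinopathy (DR) rarely take advantage of longitudinal information to improve DR analysis. In this work, we investigate the benefit of exploiting self-supervised learning with a longitudinal nature for DR diagnosis. We compare different longitudinal self-supervised learning (LSSL) methods that model disease progression from longitudinal retinal color fundus photographs (CFP), in order to detect early DR severity changes using a pair of consecutive exams. The experiments were conducted on a longitudinal DR screening dataset, with or without the trained LSSL encoders, which act as a longitudinal pretext task. The baseline (trained from scratch) achieves an AUC of 0.875, compared with an AUC of 0.96 (95% CI: 0.9593-0.9655, DeLong test), with a p-value < 2.2e-16, for early fusion using a simple ResNet-like architecture with frozen LSSL weights. This suggests that the LSSL latent space can encode the dynamics of DR progression.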
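With a prevalence of 5% to 50%, dry eye disease (DED) is one of the leading reasons for ophthalmologist consultations. The diagnosis and quantification of DED usually rely on ocular surface analysis through slit-lamp examinations. However, these evaluations are subjective and non-reproducible. To improve the diagnosis, we propose to 1) track the ocular surface in 3D using video recordings acquired during examinations, and 2) grade the severity using registered frames. Our registration method uses unsupervised image-to-depth learning. These methods learn depth from lights and shadows and estimate pose from depth maps. However, DED examinations pose unresolved challenges, including a moving light source and transparent ocular tissues. To overcome these challenges and estimate the ego-motion, we implement joint CNN architectures with multiple losses that incorporate prior known information, namely the shape of the eye, through semantic segmentation as well as sphere fitting. The achieved tracking errors outperform the state of the art, with a mean Euclidean distance as low as 0.48% of the image width on our test set. This registration improves DED severity classification by a 0.20 AUC difference. The proposed approach is the first to address DED diagnosis with supervision from monocular videos.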
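We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power) or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of one-dimensional translation-invariant characteristic kernels that are absolutely integrable. Moreover, our results apply to popular numerical procedures for determining the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative approaches to MMD kernel adaptation for two-sample testing.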
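To make the building block of this abstract concrete, the following is a minimal sketch of an MMD two-sample test for a single Gaussian kernel bandwidth, with the threshold handled through permutations. It is not the aggregated procedure itself: combining several bandwidths with corrected levels is not implemented here. The function names, the biased (V-statistic) estimator, the default of 200 permutations, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, bandwidth, n_permutations=200, seed=0):
    """Permutation p-value for H0: both samples come from the same distribution."""
    rng = random.Random(seed)
    observed = mmd2_biased(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the pooled points at random
        if mmd2_biased(pooled[:n], pooled[n:], bandwidth) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic as one of the permutations.
    return (exceed + 1) / (n_permutations + 1)
```

Running `permutation_p_value` once per bandwidth shows how strongly the outcome can depend on the kernel choice, which is the problem the aggregation in the abstract is designed to avoid.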
Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. This domain has seen fast progress recently, at the cost of requiring more complex methods. In this paper we propose FixMatch, an algorithm that is a significant simplification of existing SSL methods. FixMatch first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image. Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 labels (just 4 labels per class). We carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success. The code is available at https://github.com/google-research/fixmatch.
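The pseudo-labeling step described in this abstract can be sketched as a masked cross-entropy over unlabeled examples. The sketch below assumes the model's logits for the weakly and strongly augmented views are already available as plain lists. The function names and the threshold value of 0.95 are illustrative assumptions, not details stated in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Mean cross-entropy on strong views, counting only confident pseudo-labels.

    weak_logits / strong_logits: per-example logit lists for the weakly and
    strongly augmented versions of the same unlabeled images.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # low-confidence prediction: pseudo-label is discarded
        pseudo_label = probs.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
    # Averaged over the whole batch, so discarded examples contribute zero.
    return total / len(weak_logits)
```

In a real training loop the pseudo-label would be treated as a constant, so no gradient flows through the weakly augmented branch.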
Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that guesses low-entropy labels for data-augmented unlabeled examples and mixes labeled and unlabeled data using MixUp. MixMatch obtains state-of-the-art results by a large margin across many datasets and labeled data amounts. For example, on CIFAR-10 with 250 labels, we reduce error rate by a factor of 4 (from 38% to 11%) and by a factor of 2 on STL-10. We also demonstrate how MixMatch can help achieve a dramatically better accuracy-privacy trade-off for differential privacy. Finally, we perform an ablation study to tease apart which components of MixMatch are most important for its success. We release all code used in our experiments.
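The two ingredients named in this abstract, guessing low-entropy labels and mixing examples with MixUp, can be sketched as follows. The abstract does not say how the low-entropy guess is formed, so averaging predictions over augmentations and sharpening with a temperature is an assumption here. The same applies to the function names, the temperature of 0.5, the Beta parameter of 0.75, and the choice to keep the mix closer to the first example.

```python
import random


def sharpen(probs, temperature=0.5):
    """Lower the entropy of a distribution by raising it to the power 1/T."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def guess_label(predictions, temperature=0.5):
    """Average class predictions over several augmentations, then sharpen."""
    count = len(predictions)
    mean = [sum(column) / count for column in zip(*predictions)]
    return sharpen(mean, temperature)


def mixup(x1, y1, x2, y2, alpha=0.75, rng=random):
    """Convex combination of two (input, label) pairs with a Beta-drawn weight."""
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # assumed variant: result stays closer to (x1, y1)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

For instance, `guess_label([[0.7, 0.3], [0.5, 0.5]])` averages the two predictions to roughly `[0.6, 0.4]` and then sharpens that toward the more likely class.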